
73 research outputs found

Real-time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification

Authors: Ai Haizhou, Chen Long, Shang Chong, Zhuang Zijie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/09/2018

Online multi-object tracking is a fundamental problem in time-critical video analysis applications. A major challenge in the popular tracking-by-detection framework is how to associate unreliable detection results with existing tracks. In this paper, we propose to handle unreliable detection by collecting candidates from the outputs of both detection and tracking. The intuition behind generating redundant candidates is that detections and tracks can complement each other in different scenarios: high-confidence detection results prevent tracking drift in the long term, and track predictions can handle noisy detections caused by occlusion. To select optimally from a considerable number of candidates in real time, we present a novel scoring function based on a fully convolutional neural network that shares most computations over the entire image. Moreover, we adopt a deeply learned appearance representation, trained on large-scale person re-identification datasets, to improve the identification ability of our tracker. Extensive experiments show that our tracker achieves real-time and state-of-the-art performance on a widely used people tracking benchmark. Comment: ICME 201
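As a rough illustration of the candidate-collection idea in the abstract above, the sketch below pools boxes from a detector and from track predictions and then keeps the best-scoring, non-overlapping ones. It is not the authors' implementation: the function names, the thresholds, and the use of greedy non-maximum suppression are assumptions made for this example, and the scores are supplied by hand, whereas the abstract describes scoring candidates with a fully convolutional network.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Scored = Tuple[Box, float]               # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_candidates(
    detections: List[Scored],
    track_predictions: List[Scored],
    iou_threshold: float = 0.5,  # hypothetical value
    min_score: float = 0.3,      # hypothetical value
) -> List[Tuple[Box, float, str]]:
    """Pool both candidate sources, then keep high-scoring, non-overlapping boxes."""
    pool = [(box, score, "detection") for box, score in detections]
    pool += [(box, score, "track") for box, score in track_predictions]
    # Drop low-confidence candidates, then visit the rest from best to worst.
    pool = [c for c in pool if c[1] >= min_score]
    pool.sort(key=lambda c: c[1], reverse=True)
    kept: List[Tuple[Box, float, str]] = []
    for cand in pool:
        # Greedy non-maximum suppression against everything already kept.
        if all(iou(cand[0], k[0]) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept
```

In this toy setting, a track prediction survives only where no stronger detection covers the same region, which mirrors the complementary roles the abstract describes for the two sources.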

arXiv.org e-Print Archive

Crossref

Learning Lightweight Pedestrian Detector with Hierarchical Knowledge Distillation

Authors: Ai Haizhou, Chen Long, Chen Rui, Shang Chong, Zhuang Zijie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/09/2019

It remains very challenging to build a pedestrian detection system for real-world applications, which demand both accuracy and speed. This work presents a novel hierarchical knowledge distillation framework for learning a lightweight pedestrian detector that significantly reduces computational cost while retaining high accuracy. Following the 'teacher-student' paradigm, in which a stronger, deeper neural network teaches a lightweight network to learn better representations, we explore multiple knowledge distillation architectures and reframe this approach as a unified, hierarchical distillation framework. In particular, the proposed distillation is performed at multiple hierarchies and multiple stages of a modern detector, which enables the student detector to learn both low-level details and high-level abstractions simultaneously. Experimental results show that a student model trained with our framework, with 6 times fewer parameters, still achieves performance competitive with the teacher model on a widely used pedestrian detection benchmark. Comment: Accepted at ICIP 2019 as Ora
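The abstract above does not give the loss functions, so the following is only a minimal sketch of how distillation at several levels is commonly combined: a feature-mimicking term at each chosen stage plus a softened-output term. The function names, the mean-squared-error and KL-divergence choices, the weights, and the temperature are all assumptions for illustration, not the paper's formulation.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened outputs, scaled by T^2 as is customary."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def feature_mimic_loss(student_feat: Sequence[float], teacher_feat: Sequence[float]) -> float:
    """Mean squared error between flattened student and teacher feature maps."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(teacher_feat)


def hierarchical_distillation_loss(
    student_feats: List[Sequence[float]],  # one flattened feature map per stage
    teacher_feats: List[Sequence[float]],
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    feat_weight: float = 1.0,   # hypothetical weighting
    soft_weight: float = 1.0,   # hypothetical weighting
    temperature: float = 2.0,   # hypothetical temperature
) -> float:
    """Sum of per-stage feature losses (low level) and an output loss (high level)."""
    feat = sum(feature_mimic_loss(s, t) for s, t in zip(student_feats, teacher_feats))
    soft = soft_target_loss(student_logits, teacher_logits, temperature)
    return feat_weight * feat + soft_weight * soft
```

In practice the student and teacher feature maps usually differ in channel count, so an adapter layer would be needed before comparing them; the sketch assumes the shapes already match.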

arXiv.org e-Print Archive

Crossref

Real time facial expression recognition with AdaBoost

Authors: Bo Wu, Chang Huang, Haizhou Ai, Yubo Wang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

In this paper, we propose a novel method for facial expression recognition. The facial expression is extracted from human faces by an expression classifier that is learned by boosting Haar-feature-based Look-Up-Table (LUT) type weak classifiers. The expression recognition system consists of three modules: face detection, facial feature landmark extraction, and facial expression recognition. The implemented system can automatically recognize seven expressions in real time: anger, disgust, fear, happiness, neutral, sadness, and surprise. Experimental results are reported to show its potential applications in human-computer interaction.
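To make the idea of boosting Look-Up-Table weak classifiers more concrete, here is a small sketch in the style of Real AdaBoost: each weak classifier splits one feature's range into equal-width bins and stores a real-valued confidence per bin. It is a binary toy version under stated assumptions, not the paper's system: the feature values stand in for precomputed Haar responses, the bin count, smoothing constant, and function names are hypothetical, and recognizing seven expressions would need a multi-class extension (for example one-vs-rest) that is not shown.

```python
import math
from typing import List, Tuple

Lut = Tuple[int, float, float, List[float]]  # (feature index, lo, hi, per-bin output)


def bin_index(value: float, lo: float, hi: float, n_bins: int) -> int:
    """Map a feature value to one of n_bins equal-width bins, clamping at the ends."""
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)


def fit_lut(values, labels, weights, n_bins: int, eps: float = 1e-6):
    """Build one LUT weak classifier and return it with its normalization factor Z."""
    lo, hi = min(values), max(values)
    w_pos, w_neg = [0.0] * n_bins, [0.0] * n_bins
    for v, y, w in zip(values, labels, weights):
        b = bin_index(v, lo, hi, n_bins)
        if y > 0:
            w_pos[b] += w
        else:
            w_neg[b] += w
    # Real AdaBoost output per bin: half the smoothed log-ratio of class weights.
    table = [0.5 * math.log((p + eps) / (n + eps)) for p, n in zip(w_pos, w_neg)]
    z = 2.0 * sum(math.sqrt(p * n) for p, n in zip(w_pos, w_neg))
    return lo, hi, table, z


def train(features: List[List[float]], labels: List[int],
          n_rounds: int, n_bins: int = 4) -> List[Lut]:
    """features is samples x features; labels are +1 or -1."""
    n = len(labels)
    weights = [1.0 / n] * n
    model: List[Lut] = []
    for _ in range(n_rounds):
        best = None
        for j in range(len(features[0])):
            column = [row[j] for row in features]
            lo, hi, table, z = fit_lut(column, labels, weights, n_bins)
            if best is None or z < best[0]:  # smaller Z means a better weak classifier
                best = (z, j, lo, hi, table)
        _, j, lo, hi, table = best
        model.append((j, lo, hi, table))
        # Re-weight samples so that poorly handled ones count more next round.
        weights = [w * math.exp(-y * table[bin_index(row[j], lo, hi, n_bins)])
                   for w, y, row in zip(weights, labels, features)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model: List[Lut], row: List[float]) -> int:
    """Sign of the summed LUT outputs over all selected weak classifiers."""
    score = sum(table[bin_index(row[j], lo, hi, len(table))]
                for j, lo, hi, table in model)
    return 1 if score >= 0 else -1
```

Because each weak classifier is a table lookup on a single feature value, evaluation is cheap, which is consistent with the real-time emphasis of the abstract.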

CiteSeerX

Crossref

Baichuan 2: Open Large-scale Language Models

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks such as MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to help the research community better understand the training dynamics of Baichuan 2. Comment: Baichuan 2 technical report. Github: https://github.com/baichuan-inc/Baichuan

arXiv.org e-Print Archive